COVID vaccine stigma: detecting stigma across social media platforms with computational model based on deep learning

The study presents the first computational model of COVID vaccine stigma that can identify stigmatised sentiment with a high level of accuracy and generalises well across a number of social media platforms. The aim of the study is to understand the lexical features that are prevalent in COVID vaccine discourse and in disputes between anti-vaccine and pro-vaccine groups. This should provide better insight for healthcare authorities, enabling them to navigate those discussions more effectively. The study collected English-language posts and their comments related to COVID vaccine sentiment from Reddit, Twitter, and YouTube for the period from April 2020 to March 2021. The labels used in the model, “stigma”, “not stigma”, and “undefined”, were collected from a smaller Facebook (Meta) dataset and successfully propagated into a larger dataset from Reddit, Twitter, and YouTube. The success of the propagation task and the subsequent classification is a result of a state-of-the-art annotation scheme and annotated dataset. Deep learning with pre-trained word vector embeddings significantly outperformed traditional algorithms according to a two-tailed t-test (P(T≤t)) and achieved an F1 score of 0.794 on the three-class classification task. Stigmatised text in COVID anti-vaccine discourse is characterised by high levels of subjectivity, negative sentiment, anxiety, anger, risk, and healthcare references. After the first half of 2020, anti-vaccination stigma sentiment often appears in comments on posts attempting to disprove COVID vaccine conspiracy theories. This is inconsistent with previous research findings, in which anti-vaccine people stayed primarily within their own in-group discussions. This shift in the behaviour of the anti-vaccine movement from affirming climates to ones with opposing opinions is discussed and elaborated further in the study.


Introduction
Very often, vaccination discussions use language projections that transform into strongly stigmatised opinions against groups involved in healthcare, government institutions, or individuals who choose to vaccinate; conversely, pro-vaccination groups direct such opinions against individuals who do not want to vaccinate. On the one hand, such stigmatised sentiments perpetuate antagonism and hostility between pro- and anti-vaccine groups. On the other hand, they reinforce fear and doubt about vaccines' side effects, leading to disputes about their overall effectiveness. COVID vaccines are probably not unique in that respect; however, they began attracting negative, discrediting comments long before they were developed, which is probably unparalleled and also a very dangerous development for the pandemic's course. These discrediting comments can be explained by the unconscious tendency to assign more blame and stigma to conditions that seem more threatening and unknown than to conditions that are perhaps equally dangerous but are better understood, as was observed in [1] and [2].
This study's results can help to design a model for identifying stigmatised sentiment in discussions about vaccines, both those developed during the ongoing pandemic and those that have been on the market for many years but still face resistance. Building a computational model of COVID vaccine sentiment is not a trivial task. It requires defining the concept of stigma and then translating it into a computational model that can identify such sentiment in a text. As for the definition of stigma, etymologically, it comes from the Latin stigmat-, meaning "mark" or "brand", from the Greek stizein, meaning "to tattoo", and was first mentioned in English texts in reference to a "scar from an iron". However, in modern use, "scar" or "mark" is used in a metaphorical sense to represent a set of negative, often unfair beliefs, and a mark of shame projected by one person or a group of people onto another person or group [3].
Stigma denotes an unusual and negative thing about a signifier that must bear the mark of discredit to identify an abomination of the body, blemishes of an individual character, or a tribal/group taboo [4]. Stereotype represents an oversimplified opinion, also described as the primary rationalisation of displaced frustration [5]. Bias is a personal unreasoned judgement, while prejudice is an irreversible prejudgement directed against a group for their supposed characteristics, expressed through projection, animosity, anxiety, and dichotomisation among others [5]. This study used these concepts interchangeably since they share a similar sentiment and can quickly lead to discrimination given the right conditions [5].
Most stigmas, prejudices, and stereotypes have an inherent element of threat and are characterised by ambivalence and contradictory ideas about someone or something [1]. These opposing ideas may represent two aspects of out-group members, for example "incompetent but warm" and "competent but not warm" [6]. The out-group can attract subjectively positive feelings that coexist with feelings of antipathy [7]. Members who exhibit both positive aspects, competence and warmth, are in-group members, resulting in in-group favouritism [6].
Therefore, it is often very difficult for in-group members of anti-vaccination or pro-vaccination groups to acknowledge anything positive about the out-group. Vaccine communities tend to favour information that reinforces their preconceived views, according to the selective exposure [8,9] and confirmation bias theories [10-12]. The confirmation bias pervasive in those discussions is likely to endorse a hypothesis that conforms to the in-group belief rather than the truth; expressing the truth might therefore mean betraying one's own community [10,12]. People strive for internal psychological consistency in order to function mentally in the real world, so those who experience internal inconsistency tend to be psychologically uncomfortable and motivated to reduce cognitive dissonance [13]. Some are so uncomfortable and stressed by such polarised ideas that they resolve the situation by blindly defending the point of view that they want to support. Leon Festinger argued that this happens especially in perturbed situations, when disagreement becomes more intense despite all parties being exposed to the same evidence [13]. People also justify their behaviour by rationalisation or by avoiding circumstances where they can be confronted with contradictory information or opposing opinions. Comments that members deem offensive and in conflict with the in-group view can be deleted, or the contributor can be blocked from the page [14]. Such anxiety during any interaction with out-group members can be caused by stereotyping, dissimilarity, and lack of contact (keeping only to in-group conversations) [15].
"Anti-vaxxer" accounts on social media sites like Facebook (Meta), Twitter, YouTube, and Instagram reach more than 59 million followers [16]. The 12 biggest accounts are responsible for 65% of the alleged disinformation shared online [17] and spread over a dozen platforms [18], which primarily concerns vaccines developed in Western countries. Some influencers have been offered money to spread misinformation [18]. Additionally, a media company was used as a platform for spreading alarmist headlines about Pfizer vaccine side effects and theories that the public inoculations of politicians are a hoax [19]. Claims that infertility is a side effect from the Pfizer vaccine have been circulating on YouTube since early 2020 and are complicated by the absence of data on the impact of vaccines on pregnant women [20]. High demand for and a low supply of information create an uncertainty vacuum in which conspiracy theories and prejudiced views flourish [5,20]. Johnson et al. suggested that the growth rate of an influential anti-vaccination movement can be curbed if they are intervened with, although the outcomes of intervention had not been researched [21]. In 2019, Facebook (Meta) started removing posts about vaccine hoaxes, across the platform, including in private pages and groups [16]. There were also attempts in mid-late 2020 to ban the most prolific anti-vaccination accounts [22]. In February 2021, Facebook (Meta) widened its ban on vaccine misinformation and pledged to remove claims that vaccines are not effective against diseases, vaccines cause autism, that it is safer to contract COVID-19 than to receive the vaccine and so forth, in effect removing around 2 million pieces of widely debunked content [16,23].
Despite these restrictions, anti-vaccination accounts partially bounce back by moving to different platforms or joining forces with other groups, such as anti-government groups [22], given that the main scapegoats for anti-vaccination communities are government institutions, pharmaceutical companies, and health authorities [24]. Some anti-vaccination contributors get around moderation policies by posting through so-called echo chambers or filter bubbles in the comment sections of news items on Facebook (Meta), where they are not subject to warning labels from third-party fact-checking partners [25].
One of the findings of the current research is that prejudiced sentiment and conspiracy theories about COVID vaccines have been circulating in the comment sections of health authorities that try to disprove COVID-19 conspiracy theories. This suggests that efforts to curb anti-vaccination pages have a counter-productive effect and might not be the best strategy for dealing with the anti-vaccination movement. Moreover, observation shows that anti-vaccine pages on Facebook (Meta) started to form even more tightly-knit, exclusive communities with accounts set to private view. The following research questions aim to shed light on COVID vaccine stigma and its features through anti-vaccine and pro-vaccine discussions on social media domains:
Q1. How can a rigorous computational model identify COVID vaccine stigma across social media platforms?
Q2. Is there a significant computational advantage among the models for identifying COVID stigma in the study?
Q3. Which textual features are characteristic of COVID vaccine discourse stigma and which features are preferable in communication on the topic?
Q4. Does the COVID vaccine stigma lead to disengagement with content or is the reverse true?
Q5. How can the stigma and friction in vaccination discourse be reduced on social media platforms? Why might that be important?
The first research question is addressed in Section 3, Materials and Methods; Section 4, Results; and Section 5, Discussion and Conclusion. The second research question is addressed in Section 4.3, Classification models, and is concluded in Section 5. The third research question is answered in Section 4.4, Features, and in Section 5. The fourth research question is discussed in Section 4.4 and Section 5. The fifth research question is put forward in the Introduction and addressed in Section 5, Discussion and Conclusion.

Literature background
The body of literature was searched for healthcare stigma research conducted on social media sites and online forums from 2015 to 2020. Research conducted by the author was excluded from the initial review. Additional studies were added based on the keywords "COVID stigma" and "COVID vaccines" for the period 2020 to 2021. After several screening rounds, out of an initial 5209 studies, 12 studies were included in the final selection, based on their quality and relevance. An additional four studies that discussed COVID vaccines were also incorporated. The primary focus of the current research is to explore studies that either try to identify stigma in social media posts or study stigma from social media content that was directed at certain preventive measures or health-related issues. Five quantitative and seven qualitative/mixed studies were identified from the initially reviewed articles. Among these, 67% of studies [26-28, 31, 32, 35, 36] examined various mental health stigmas, [29] analysed suicide stigma, [30] discussed vaccine stigma among mothers who refuse vaccines for their children, [37] discussed stigma linked to the COVID pandemic, and [33,34] explored weight stigma. The additional four studies about COVID vaccines are primarily theoretical articles. [38,39,41] discussed pro- and anti-vaccine attitudes. [40] used a mixed approach and discussed the polarisation of attitudes towards the COVID vaccine based on political affiliation.
Machine learning techniques were applied primarily in quantitative research to build classification models [27-29, 32, 36]. An F1 score of 72.79% was achieved using a Decision Tree technique to classify stigmatising vs. non-stigmatising sentiment, with an inter-rater agreement of Cohen's κ = 0.73 [29]. Similarly, [27] obtained an F1 of 75.20% using a Random Forest model. A comparable or higher F1 measure was achieved in the present study using a CNN.
In [32], two researchers manually coded 311 randomly selected tweets and assigned six dimensions with varying degrees of inter-rater agreement. The dimensions "metaphorical", "organisation", "informative", "personal", and "joke" were linked to stigma in [32]. However, a joke, an organisation reference, informative content, or a figure of speech does not always imply stigma. In [36], content analysis of tweets was conducted by one of the authors, and colloquialism was concluded to represent stigma. In such cases, quantitative models will look for colloquialism, metaphor, and so on rather than stigma sentiment. Stigma can be expressed in various linguistic styles; however, this does not mean that metaphor or colloquialism should be treated as stigma.
While having the authors of an article act as data annotators might lead to better inter-rater agreement, it might also introduce the authors' bias into the model, since annotated data have a direct influence on the model's outcome. Moreover, according to good annotation practice, measuring inter-rater reliability based on two annotations or assignments per post is rarely considered enough [49]. Each post/comment in the current research was classified into one of the categories stigma, not stigma, or undefined. The "gray zone" of the undefined category and its features have not been previously studied and are of interest to the current research. Because the data have three independent expert annotations per post, with a fourth assignment in cases of disagreement, the annotated data can be considered reliable.
The annotation in the current research is not limited to a healthcare context or vaccination discourse and can be applied to studying the concept of stigma across a wide variety of disciplines. The largest share (37%) of the studies [26, 31-33, 36, 37] were based on Twitter data, while [27-29] derived data from Sina Weibo and [30,39] studied interview questionnaires. [40] discussed stigma based on Facebook data, [35] analysed data from an online forum, and [34] explored stigma on YouTube. Both [41] and [38] mentioned various social media sites in the discussion.
To the best of the author's knowledge, the current study introduces the only computational model that can identify COVID vaccine stigma across several social media domains (Facebook, Twitter, YouTube, and Reddit). The differences between those social media domains are substantial in terms of the length of the text, engagement parameters, users, and the way information is communicated, and therefore they serve as a good test of the model's performance. The current study fills the gap of a reliable, rigorous annotation process and scheme that reflects the main research works on stigma and can be applied in other domains beyond the vaccination discourse. Moreover, the study attained good classification results with pre-trained deep learning models together with some traditional models. Models were selected based on the problem description and the type of data, which has unbalanced classes.

Study design
The main purpose of this study was to build a model that can identify COVID vaccine stigma with high levels of precision and then analyse its outcomes. The present study used a cross-sectional approach, because it was more important to identify the stigma features and differences between stigmatised sentiment and non-stigmatised sentiment in a given period of time rather than to study changes of the concept over time with a longitudinal approach.
The development of an annotation scheme and process necessitated the inclusion of elements of an experimental nature. The initial nine short annotation categories were refined into a state-of-the-art scheme with fewer categories. This was a result of continuous feedback from trained annotators and of agreement estimates from Cohen's and Fleiss's kappa. The main body of work is non-experimental and quantitative, and the source data were not tampered with, because the identification of stigma and its characteristic features is central to the study. This study used an analytic, observational, and retrospective approach, with elements of a quasi-experiment. Moreover, the study did not collect any sensitive information; therefore, no special permission was necessary to process the data. The data were shared by private individuals on social media pages consensually and publicly.

Data collection
COVID vaccine data were collected in English from social media domains (Reddit, Twitter, and YouTube) retrospectively for the period from April 2020 to March 2021. This includes the time before the COVID vaccine rollout and extends through roughly 3 months after the first person was vaccinated with the Pfizer vaccine on 8 December 2020 [42]. The collected data included posts with stigmatised sentiment towards COVID vaccines and their comments, as well as posts that sought to disprove COVID vaccine conspiracy theories and their comments.
Reddit posts and comments were collected through PRAW using a Python script, Twitter data were collected with Octoparse [43], and YouTube data were collected with YouTube Comment Scrapper [44]. To collect data from Reddit, the search phrase "COVID vaccine" was used to compile posts in the Conspiracy subreddit. Posts with stigmatised sentiment were selected according to the criteria presented in the stigma annotation scheme shown in Table 1: a post had to correspond to the definition of stigma presented in the annotation scheme and match a minimum of three components of the definition. These components include, but are not limited to, blame, conflict (hate, fear), suspicion, rejection, inflexible unfounded overgeneralisation, one-sided interpretation, and dichotomisation.
For Twitter, the same search criteria were applied, along with the condition that the posts should have accrued a minimum of 50 comments, 50 retweets, and 50 likes; results were sorted by "top" posts. The most prevalent topics in COVID vaccine debates on YouTube were conspiracy and side effects. Therefore, data were collected from YouTube videos with a minimum of 50 comments using the search phrases "COVID vaccine conspiracy" and "COVID vaccine side effects/serious side effects".
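The Twitter inclusion criteria above can be illustrated with a minimal sketch. The `Tweet` record, its field names, and the helper functions are hypothetical and only mirror the stated thresholds; the actual collection was carried out with Octoparse, whose export format is not described in the text.

```python
from dataclasses import dataclass

MIN_ENGAGEMENT = 50  # threshold stated in the text for comments, retweets, and likes


@dataclass
class Tweet:
    """Hypothetical record for one collected tweet."""
    text: str
    comments: int
    retweets: int
    likes: int


def meets_threshold(tweet, minimum=MIN_ENGAGEMENT):
    """True if comments, retweets, and likes all reach the minimum."""
    return min(tweet.comments, tweet.retweets, tweet.likes) >= minimum


def select_tweets(tweets, phrase="covid vaccine"):
    """Keep tweets that contain the search phrase and meet the threshold."""
    return [t for t in tweets if phrase in t.text.lower() and meets_threshold(t)]


# Toy records (hypothetical, not taken from the study's dataset)
sample = [
    Tweet("COVID vaccine trial update", comments=120, retweets=75, likes=300),
    Tweet("COVID vaccine rumour", comments=49, retweets=200, likes=500),
    Tweet("Unrelated post", comments=90, retweets=90, likes=90),
]
selected = select_tweets(sample)
```

Only the first toy record passes: the second falls below the comment threshold and the third does not contain the search phrase.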

Data model and analyses
According to Kang-Xing Jin, head of health at Facebook (Meta), despite all of their screening efforts, vaccine comments are "nuanced", which makes it difficult to discern between people's personal experiences of feeling sick after being vaccinated and content aimed at discrediting and misinforming [23]. Similar challenges were faced in the current study because the main purpose was to discern stigmatised discrediting posts from personal experiences to understand the reasons for polarisation in the vaccination debate, engagement with stigmatised content, and possible ways to narrow the gap of contrariety of opinion between anti-vaccine and pro-vaccine groups. In order to build a model that identifies stigmatised sentiment, the concept must first be defined.
However, in addition to the lack of general consensus among researchers on the definition of the concept, stigma sentiment is multifaceted and thus requires rigor in designing an annotation scheme with definitions and an annotation process. Link and Phelan (2000) pointed out, "The stigma concept has been applied to an enormous array of circumstances. Each one of these is unique and each one is likely to lead investigators to conceptualise stigma in a somewhat different way" [46]. Link and Phelan (2000) elaborated that the concept is multidisciplinary, with contributions from various disciplines. Even within a single discipline, researchers approach the concept from various theoretical angles, which also leads to different interpretations [46].
One of the challenges mentioned by authors in relation to the concept is that interpretation by social science researchers is from the theoretical perspective rather than lived experience [46]. Taking into account the complexity of the concept, during the annotation process, the present study arrived at a construct that includes those characteristics most established by the research community along with feedback from laymen. The convoluted concept based on theoretical frameworks from [1,4,5,45-47] was split into simpler definitions centred around characteristics, which are presented in Table 1.
Most labels stem from the literature; however, the category "personal opinion/projection" was derived through an annotation process and might reflect the lived experiences of the annotators.
Initially, the annotation schema contained nine categories, but these were later clustered into four groups: hostility stigma, overgeneralisation stigma, undefined, and not stigma. The hostility stigma represents a stronger stigma sentiment than the inconsistency/overgeneralisation stigma and was easier to identify in the texts, which is reflected in the better annotation agreement rate for the category. The annotation schema evolved from the process described in [48] to the schema shown in Table 1, with literature definitions and post/comment examples. The comments originate from the YouTube, Reddit, and Twitter COVID stigma dataset described in Table 2. The literature references provide the label definitions and clarify the reasons for their selection.
Each comment was annotated three times, with a fourth annotation conducted when there was no consensus on the category assignment. Comments, referred to as "markables", were annotated by a set of annotators (c), who assigned labels from a set of categories (k) presented in the annotation schema. Observed agreement (Ao) measured the percentage of judgements on which the annotators agreed when independently coding the same data, that is, the number of items with agreement divided by the total number of data points N [49]:
$$A_o = \frac{1}{N} \sum_{i=1}^{N} \mathrm{agr}_i, \qquad \mathrm{agr}_i = \begin{cases} 1 & \text{if the three coders assign } i \text{ to the same category} \\ 0 & \text{if the three coders assign } i \text{ to different categories} \end{cases}$$
Eleven annotators were recruited through a personal network. All of them had some social science background. They were of various ages and both genders were represented. Roughly the same number of annotators were recruited through Amazon MTurk. The annotators independently assigned three labels to each comment.
Fleiss kappa was an appropriate measure for quantifying chance-corrected agreement that reflects the combined judgements of all of the coders [50]. In its formula, P(k) is the expected agreement, i is the total number of assignments, c is the number of coders, and n_k is the number of times an item i was classified in category k.
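The two agreement measures can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the study's code: `observed_agreement` follows the all-coders-agree definition given above, while `fleiss_kappa` uses the standard textbook formulation (mean pairwise agreement per item, expected agreement from pooled category proportions). The toy annotations are hypothetical.

```python
from collections import Counter


def observed_agreement(annotations):
    """Share of items on which all coders chose the same category."""
    agreed = sum(1 for labels in annotations if len(set(labels)) == 1)
    return agreed / len(annotations)


def fleiss_kappa(annotations):
    """Fleiss' kappa for items that each carry the same number of labels."""
    n_items = len(annotations)
    n_coders = len(annotations[0])
    category_totals = Counter()
    pairwise_sum = 0.0
    for labels in annotations:
        counts = Counter(labels)
        category_totals.update(counts)
        # Proportion of agreeing coder pairs for this item
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        pairwise_sum += agreeing_pairs / (n_coders * (n_coders - 1))
    a_o = pairwise_sum / n_items
    # Expected agreement: sum of squared pooled category proportions
    a_e = sum((n_k / (n_items * n_coders)) ** 2 for n_k in category_totals.values())
    return (a_o - a_e) / (1 - a_e)


# Hypothetical toy annotations: three labels per comment
toy = [
    ("stigma", "stigma", "stigma"),
    ("not stigma", "not stigma", "not stigma"),
    ("stigma", "stigma", "undefined"),
    ("not stigma", "undefined", "undefined"),
]
share = observed_agreement(toy)
kappa = fleiss_kappa(toy)
```

Note that the full-agreement share and the pairwise agreement inside Fleiss' kappa are different quantities, which is why the text reports a share of agreement alongside each kappa value.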
Fleiss kappa of 0.84 (P(k)), with an 89% share of agreement, was attained with two annotated labels: "stigma" and "not stigma". Fleiss kappa of 0.62 (P(k)), with a 68% share of agreement, was achieved with three classes: "stigma", "not stigma", and "undefined". However, the present study was based on three classes because the gray zone of the undefined class is of interest in terms of its features. The process continued with label propagation and the subsequent COVID vaccine stigma feature analyses, as shown in Fig. 1. The data model in Fig. 1 is the process that began with the initial data collection from Facebook (Meta) and the annotation of every post/comment by three annotators, followed by propagating the labels to a larger dataset from Reddit, YouTube, and Twitter. Machine learning models were applied to the propagated dataset to evaluate the performance of traditional models, such as logistic regression and random forest, against deep learning models, namely a CNN with pre-trained GloVe, FastText, ELMo, and Gensim embeddings. Eventually, features were analysed for each of the stigma, not stigma, and undefined labels, with linguistic and psychological categories from Linguistic Inquiry and Word Count (LIWC) [51]. Additional features in the model include scores for sentiment polarity, subjectivity, and engagement.

Table 1. Annotation scheme with example comments and literature definitions.

1. Hostility stigma: Animosity, Condescension, Aggression
Examples:
i) "So today I heard that if you don't have covid and you get tested, they literally put the virus on the swab they test you with to infect you. the goal is to have everyone positive so we're forced to get the covid-19 vaccine (which will have a microchip in it)."
ii) "The COVID apartheid is gathering pace Spain intends to set up a registry of people refusing a vaccine. This would be shared with other EU countries. This infringement of civil liberties sets a dangerous precedent. Freedoms lost are rarely regained easily."
iii) "Bill Gates says Trump claim about COVID cure is 'inappropriate' Oh so 'doctor' Gates wouldn't be able to sell his dodgy vaccine if Trump's drug works... ".
Literature definitions:
i) Ad-hoc scapegoats might not be lily-white, but they always attract more blame [5]. Frustration generates aggression, which becomes displaced on relatively defenceless goats and is rationalized by blaming, projecting, and stereotyping [5]. Suspicion of the out-group comes from fear of defeat or by default [45]. Most stigmas hold an element of threat [1].
ii) Evidence about the subtleness of stigma suggests that fear may be part of the sentiment [46]. Externalization of conflict (it is not I who hates and injures others, it is they who hate and injure me) [5].
iii) E. Goffman outlines that one way to express stigma is to point to blemishes of individual character such as weak will, domineering nature, dishonesty, wrong political views etc. [4]. Anger is an emotion directed at a single object; hatred is a sentiment directed at the whole class [5]. Under certain circumstances there will be step-wise progression from verbal rejection to violence [5].

2. Expressions that sustain inconsistency and over-generalization: (i) Inflexible unfounded overgeneralisation, One-sided interpretation; (ii) Predicting, guessing; (iii) Unsupported judgement, Personal opinion, Projection; (iv) Dichotomization, Tabloid thinking, Demagoguery
Examples:
i) "...Test and Trace -dead Lockdowns -exposed as ineffective Curfews -useless Mass Testing -full of inaccuracies Covid deaths -questionable data Vaccine -rushed and suspect nothing this government and SAGE do has any credibility."
ii) "..I can guarantee the brainwashed will be flocking to get the jab! They've probably not made as much money on the flu jabs this year.. Scaremongering!"
iii) "So what happens when everyone who gets the vaccine then tests "positive" for "Covid"? I know! The government continues to lock us down, destroys our lives and livelihoods. Oh, plus a "new strain". Rinse repeat until all small business is destroyed and we're all desperate/destitute".
iv) "32.7m people have died of HIV/AIDS in the last 35 years 690,000 died in 2019 alone. There is no vaccine for HIV/AIDS despite best efforts over those 35 years COVID though? 6 months and 3 companies have a vaccine which is 90% effective. Sound plausible?"
Literature definitions:
i) If the people being judged are out-group members, the perceiver will see them as especially similar, lacking in variability [47].
ii) Uncertainty fuels prejudice [5]. There is interest in imaginative processes, in fantasies, in theoretical reflections, in artistic activities [5].
iii) Based on the input from annotators: "personal opinions and projections which are not substantiated feel like stigma/prejudice". Favorableness or unfavorableness that accompanies unsupported judgement and is not based on previous experience [5].
iv) A prejudiced person is given to two-valued judgements and dichotomizes when thinking of nature, of law, of morals [5]. Demagoguery justifies and encourages tabloid thinking, stereotyping, and the conviction that the world is made up of swindlers [5].

3. Lacking context to make a decision
Examples:
"I'm a little confused. I thought Kennedy wasn't for forced vaccinations."
"I can't even breathe w one freaking mask. Ridiculous."
"Then you should have no worries volunteering yourself ... take the trial vaccines as you know so much about vaccines abi?"

4. Not stigma
Examples:
"Please Sir, what other option do you have aside vaccine?"
"Herd immunity for thee, vaccine for meeeee."
"Heard from who? Sources? Proof?"

The analysis of variance (ANOVA) F value was used to determine whether the continuous variables/features were significant for the classification task (i.e. how well they discriminate between multiple classes). The 30 best features were selected with SelectKBest in scikit-learn. The ANOVA F value formula is as follows [52]:
$$F = \frac{(SSE_1 - SSE_2)/m}{SSE_2/(n - k)}$$
where SSE is the residual sum of squares (with subscripts 1 and 2 denoting the restricted and unrestricted models), m is the number of restrictions, k is the number of independent variables, and n is the number of observations. It was implemented with scikit-learn using the RFE algorithm implementation presented in [53]. The z-score calculates how many standard deviations above or below the population mean a data point (feature) is. The z-score formula is:
$$z = \frac{x - \mu}{\sigma}$$
where x is the data point, μ is the population mean, and σ is the population standard deviation. According to the data displayed in Table 4 and Section 4, the emotional tone feature of stigmatised sentiment deviates from the general emotional tone for the total population: stigmatised sentiment is expressed with less emotion (a negative z-score value of −5.9594).
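The feature-scoring step can be illustrated with a minimal sketch using scikit-learn's `SelectKBest` with the `f_classif` scorer (the one-way ANOVA F value). The synthetic data, the integer coding of the three classes, and `k=2` are assumptions made for illustration only; the study selected the 30 best features from its own feature set. The `z_score` helper implements the standard definition relative to the population mean and standard deviation.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)

# Hypothetical coding: 0 = stigma, 1 = not stigma, 2 = undefined
labels = np.repeat([0, 1, 2], 100)

# Five synthetic features; only column 0 depends on the class label
features = rng.normal(size=(labels.size, 5))
features[:, 0] += 3.0 * labels

# Rank features by ANOVA F value and keep the k highest-scoring ones
selector = SelectKBest(score_func=f_classif, k=2).fit(features, labels)
selected_columns = selector.get_support(indices=True)
f_values = selector.scores_


def z_score(value, population):
    """Standard deviations between `value` and the population mean."""
    population = np.asarray(population, dtype=float)
    return (value - population.mean()) / population.std()
```

With this construction, column 0 receives by far the largest F value and is therefore among the selected columns; the remaining columns are noise and differ only by chance.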

Results
The dataset is displayed in Table 2. Undefined sentiment means anything that is difficult to construe and assign to either category. Engagement values are based on Likes (Twitter), Retweets (Twitter) and Upvotes/Downvotes (Reddit, YouTube), with the latter showing both negative and positive engagement. On average, stigmatised posts from Reddit attracted more comments, and these comments were also longer than those on Twitter and YouTube, as seen in Fig. 2. The relatively shorter length of comments on Twitter is due to the limit of 280 characters [54], with only 12% of comments being longer than 140 characters. However, the character limit on YouTube is set to 10,000, so it is unclear why comments are so brief on this site [55]. The number of tweets is limited to 2,400 per day and comments on YouTube posts are limited to 500 [56].
Several different types of stigma, such as subtle generalisations and expressions that sustain hostility, are likely to be included in one long comment on a stigmatised post on Reddit, which is different from comments on Twitter or YouTube. This could further explain why a higher proportion of stigma sentiment was discovered on the Reddit platform in comparison to the other two platforms (the findings also suggest that stigmatised sentences tend to have more characters). The COVID stigma sentiment was identified through label propagation from a smaller annotated dataset of vaccination discourse on Facebook (Meta), which comprises 2,761 comments containing anti-vaccination and pro-vaccination sentiment, with about 60% of comments showing stigma/prejudice/stereotype. The process is described in greater detail in [48] and [24].
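The idea of propagating labels from a small annotated set to a larger unlabelled one can be shown with a minimal sketch. It assumes scikit-learn's `LabelSpreading` over TF-IDF vectors, which is an illustrative choice and not necessarily the algorithm used in [48]; the toy comments, the integer label coding, and the `gamma` value are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelSpreading

# Hypothetical toy comments: the first three stand in for the annotated
# source set, the last three for the larger unlabelled target set.
texts = [
    "the vaccine is a government hoax",
    "where can I book a vaccine appointment",
    "not sure what this means",
    "this vaccine hoax is pushed by the government",
    "I booked my vaccine appointment today",
    "what does this mean",
]

# Hypothetical coding: 0 = stigma, 1 = not stigma, 2 = undefined;
# -1 marks unlabelled items, following the scikit-learn convention.
labels = np.array([0, 1, 2, -1, -1, -1])

# Dense TF-IDF vectors keep the example small and explicit
vectors = TfidfVectorizer().fit_transform(texts).toarray()

# Spread the known labels over the similarity graph of all comments
model = LabelSpreading(kernel="rbf", gamma=1.0).fit(vectors, labels)
propagated = model.transduction_
```

After fitting, `propagated` holds one label per comment, including the previously unlabelled ones. In practice, the quality of the propagated labels depends on the text representation and on how closely the source and target platforms resemble each other.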

Dataset
All of the posts/comments in the current study (Table 2) were collected around June and July 2020, after attempts were made by social media companies to close anti-vaccination accounts. However, some anti-vaccination pages are still in existence. In particular, a search of Facebook (Meta) using the keywords "vax" and "vaccine/s" returned 117 accounts primarily discussing vaccines prior to the start of the COVID-19 pandemic in February 2019.
Out of 117 accounts, 60.68% were anti-vaccine pages (no. followers: avg.  However, the most staggering change was in the number of followers: there was a drastic decrease in the number of followers of anti-vaccination pages and an increase in the number of followers of pro-vaccination pages. The sharp drop in anti-vaccine followers suggests that influential pages were deplatformed from Facebook (Meta), which is consistent with [22]. The increase in the average and maximum values of pro-vaccination pages suggests that more followers joined those discussions. Tables 3 and 4 show stigmatised sentiment as a share of the total data in response to both COVID vaccine posts and posts disproving COVID vaccine conspiracy theories. The slightly higher engagement with stigmatised anti-vaccine posts in Table 3 may suggest that in-group support still prevails. Varied engagement with posts that try to disprove conspiracy theories, shown in Table 4, may suggest a diversity of users reading the content, including out-group members. Engagement value is a supplementary feature included in the dataset with the goal of understanding the impact of stigmatised and non-stigmatised sentiment in both anti-vaccination posts and posts disproving COVID vaccine conspiracy theories. It is also the most viable and direct method of studying the impact of sentiment on social media platforms. The feature combines the following scores: downvotes and upvotes (Reddit), likes (Twitter, YouTube), dislikes (YouTube), comments (Reddit, Twitter, YouTube), and retweets (Twitter). Negative engagement primarily stems from the Reddit and YouTube platforms through downvotes and dislikes, respectively.

Comment examples
To understand the content of each of the classes, replies to posts are presented in Appendix A and Table 5. Six replies to each post were randomly chosen from the COVID anti-vaccine dataset to show instances of stigma, not stigma, and undefined sentiment. Similarly, randomly chosen examples from the disproving conspiracy posts and their comments are presented in Table 5 and Section 4.2. COVID anti-vaccine posts are those discussing conspiracy topics and expressing fear about vaccination side effects. Conspiracy sentiment includes discussions of the primary agenda behind the vaccine, speculation over identification devices in the vaccine, labelling of the pandemic as a fraud meant to eliminate small businesses, declarations that there is a 5% death rate from the vaccine without factual evidence, and vaccine development with the purpose of speculative market index.
The replies to COVID vaccine stigma posts that carry stigma sentiment (Appendix A) present the following main topics: i) agenda imposed by WHO, ii) Bill Gates, iii) the CDC, iv) population control, v) sterilisation, vi) becoming a lab rat, vii) denying the existence of a vaccine to prevent COVID, viii) calling the COVID pandemic imaginary, ix) demagoguery to reject the vaccine, x) death as a COVID vaccine side effect, xi) population control, xii) mark of the devil, xiii) agenda that has been forced by big corporations, xiv) people who get the vaccine will die, xv) calling the vaccine a murder weapon, xvi) announcing the existence of Microsoft microchips in the vaccines, and xvii) claims that the people who run the world are holding back the vaccines.
Not stigmatised replies to COVID vaccine stigma posts had the following topics: i) gratitude for the content posted, ii) explanations of what 95% vaccine effectiveness means, iii) expressions of worry about being unable to say no to the vaccine, iv) explanations about vaccine trials, v) suggestions of resources with factual evidence on the science behind the vaccines, vi) asking constructive questions, and vii) discussions of personal experiences. In addition, background was often given for the information provided.
Replies that were labelled as undefined (neither sentiment was identified) were i) asking rhetorical questions, ii) providing puzzling statements that can be interpreted as both carrying and not carrying stigma sentiment, and iii) comments hinting at a vaccine agenda, making a joke about it, or asking a question in order to understand the situation better. Discussions about serious side effects included suspicion that the vaccine was not properly tested or questioning vaccine trials, expressing fear of making it mandatory, and distrust in the effectiveness of the vaccine due to its speedy development.
Disproving COVID vaccine conspiracy posts were focused on disproving unconventional falsehoods, such as female sterilisation and challenges to concerns about the vaccines' safety. Replies to those posts are displayed in Table 5. Replies that carry stigma sentiment exhibited the following main topics: i) drug companies being protected against legal liability, ii) uncertainty in relation to pregnant women and the long-term effect on their children, iii) beliefs that safety takes decades to determine, iv) "anything" can be placed in the vaccines, v) no responsibility for serious side effects, vi) lack of information about possible lethal side effects 28 days after the vaccination, and vii) the probability that mRNA vaccines lead to cancer and changes in genes.

Classification models
Most comments correspond to the propagated label (stigma, not stigma, undefined), but some comments were misclassified. Before evaluating how well the models would perform,  the text was split into bi-gram features in order to receive more meaningful segments of the data that would potentially lead to a more straightforward interpretation. Then, the score for each bi-gram unit was calculated to establish its importance in the corpus. Terms that appeared in fewer than five documents (posts/comments) were ignored. Traditional models were applied to the data, and the results were compared with pre-trained deep learning models. Logistic regression can achieve comparable or better classification results on simpler tasks than can neural networks. However, the former can skew the result for the majority class on imbalanced data. Therefore, parameters need to be modified to take skewed distribution into account. Support vector classification is a superior technique to naive Bayes for text classification tasks. It achieved a better performance than logistic regression or naive Bayes; it also does not require tuning of the parameters. Moreover, random forest classifier (balanced subsample) is better suited for the classification task on an imbalanced dataset, because the undefined class is much smaller than the stigma and not stigma classes.
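The bi-gram feature step described above can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the text does not name the scoring scheme, so TF-IDF with a smoothed idf is an assumption, and the function names `bigrams` and `bigram_tfidf` are hypothetical. Only the cutoff of five documents comes from the text.

```python
import math
from collections import Counter


def bigrams(text):
    """Lower-case, whitespace-tokenise, and return adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def bigram_tfidf(docs, min_df=5):
    """Return one {bigram: score} dict per document.

    Bi-grams found in fewer than `min_df` documents are ignored,
    mirroring the cutoff described in the text. TF-IDF is an assumed
    scoring scheme; the source only says a score was calculated.
    """
    doc_bigrams = [bigrams(d) for d in docs]
    # Document frequency: in how many documents each bi-gram occurs.
    df = Counter(bg for bgs in doc_bigrams for bg in set(bgs))
    kept = {bg for bg, n in df.items() if n >= min_df}
    n_docs = len(docs)
    scores = []
    for bgs in doc_bigrams:
        tf = Counter(bg for bg in bgs if bg in kept)
        # Smoothed idf (assumed form) keeps weights positive for common terms.
        scores.append({
            bg: count * (math.log((1 + n_docs) / (1 + df[bg])) + 1)
            for bg, count in tf.items()
        })
    return scores
```

The resulting per-document dictionaries could then be fed to any of the traditional classifiers mentioned above; for the imbalanced undefined class, class weighting would be applied in the classifier rather than in the features.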
CNN is a good technique for some image recognition tasks; however, it can lead to over-fitting in text classification. In the COVID vaccine stigma data (Table 6), deep learning outperformed traditional models. Similarly, in the disproving conspiracy data (Table 7), deep learning outperformed traditional models. All models were evaluated ten times by bootstrapping on the disproving COVID vaccine stigma posts and their comments, and the mean of the achieved accuracy is reported for each model. The F1 measure is much higher for deep learning models (X̄ = 0.79, SD = 0.003) than for traditional models (X̄ = 0.73, SD = 0.041). CNN significantly outperformed the baselines (traditional models), as per a paired sample t-test (p < 0.05) assuming unequal variances (tStat = −4.06, one-tail P(T≤t) = 0.003, two-tail P(T≤t) = 0.006). The null hypothesis should be rejected, as the classification accuracy of deep learning models is substantially higher than the accuracy of traditional models, which answers Q2. An F1 score of 0.764 (as seen in Covid Vaccine Stigma, Table 6) was achieved with a CNN that was pre-trained on FastText WikiNews-300d-1M. FastText WikiNews-300d-1M contains 1 million pre-trained word vectors with 300 dimensions (features) that were trained on the Wikipedia 2017 data, UMBC webBase corpus, and statmt.org news dataset.
An F1 score of 0.794 (as seen in Disproving COVID Vaccine Stigma, Table 7) was achieved with a CNN that was pre-trained on Glove.6B.50d. Glove.6B.50d contains 400,000 pre-trained, 50-dimensional word vectors for an uncased vocabulary, trained on 6 billion tokens from Wikipedia 2014 data and Gigaword5 files. Evidence that the CNN model achieved an F1 score of 0.794 on the identification/classification task suggests that the propagation task (on the stigma, not stigma, and undefined labels) and the model for identifying subtle stigma sentiment were implemented effectively and perform well.
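To make the significance comparison more concrete, the following sketch computes a t statistic under the unequal-variances assumption mentioned above (the Welch form), together with the Welch-Satterthwaite degrees of freedom. It is an illustration only: the function name `welch_t` is hypothetical, the per-run scores of the study are not given in the text, and the p-value lookup is left out because it needs a t-distribution table or a statistics library.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Each sample is a list of scores, e.g. F1 values from repeated
    bootstrap evaluations of one family of models.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 denominator) divided by sample size.
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_stat = diff / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1)
    )
    return t_stat, dof
```

A negative statistic, as in the reported tStat of −4.06, simply means the first sample passed in has the lower mean; the sign depends on the order of the arguments.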

LIWC variables
Prior to the development of LIWC, Walter Weintraub hand-counted people's words in medical and political speeches and linked them to the emotional states of the speaker [64]. Weintraub was fascinated by how people use language. He associated an impulsive personality trait and binge eating disorder with the frequently used words "but", "nevertheless", and "however". People with those disorders act impulsively, and it is reflected in their speech when they use such terms to try to remedy the consequences of an impulsive action by taking back the statement. Similarly, persons with compulsive repetitive behavior try to justify such acts using expressions such as "because", "therefore", and "in order to" [64].
Weintraub's method of analysis looked for verbal categories: qualifiers ("think", "kind of", and other filler words) are inversely related to preparation; retractors ("however", "but") suggest difficulty in adhering to previous decisions; personal pronouns present an individual ("I"), a mutual course ("we"), and a more passive speaker ("me"); negatives ("not", "never", and "nothing") suggest stubbornness, opposition, or the use of coping mechanisms; and adverbial intensifiers ("very", "really", "so", "such") produce dramatic effect and are used by teenagers more than other age groups [64]. Furthermore, verbal categories were also associated with personality traits. Decisiveness was connected with high frequency use of qualifiers, and an angry disposition was associated with an increase in negatives, as much as five times that of normal speech, and with an increase in the use of rhetorical questions and direct references [64].
LIWC ("Luke") was developed similarly, with the initial goal of efficiently counting words in psychologically or grammatically relevant categories across multiple text files. Central to the analysis are LIWC dictionaries with collections of words that define categories [65]. All the relevant categories are listed, and the percentages for each category are given per post/comment, based on the total number of words in the post/comment (the analysis concerned social media data). Some LIWC categories are rather straightforward, such as articles, which consists of three words ("a", "an", "the"), whereas categories for social and emotional processes are more complex, and three researchers had to agree on the assignment of words to those categories [65].
From its first version, LIWC 1997 [66], to the LIWC 2015 [67] version, LIWC software has studied social, psychological, and linguistic processes in an efficient way. LIWC feature analyses of a written text can reveal a lot about an author or historical figure quickly and correctly, adding to descriptions by historians, which can themselves carry bias.
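The dictionary-based counting described above can be illustrated with a small sketch. The two-category dictionary below is hypothetical and far smaller than the proprietary LIWC dictionaries; only the article words and the negatives are taken from the text, and the category names and the function `category_percentages` are assumptions made for the example.

```python
import re

# Hypothetical mini-dictionary; the real LIWC dictionaries are much larger.
CATEGORIES = {
    "article": {"a", "an", "the"},
    "negate": {"not", "never", "nothing"},
}


def category_percentages(text, categories=CATEGORIES):
    """Return {category: share of words in `text`, in percent}.

    Percentages are relative to the total number of words in the
    post/comment, as described in the text.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(word in vocab for word in words) / len(words)
        for name, vocab in categories.items()
    }
```

For a six-word comment containing two articles, the article category would come out at one third of the words, which is the kind of per-comment value that serves as a feature in the later analysis.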
For example, the use of more tentative language, such as filler words, suggests that a person is uncertain/insecure about the topic. Negative emotions, death references, and first-person singular can suggest that a person is depressed, with suicidal thoughts [65].
There are various research articles that successfully apply LIWC features to perform correlation, classification type tasks [28,68,69], and prediction type tasks [70].
Schizophrenia stigma in [28] was studied with 27 LIWC features and was associated with social processes, humans, death, and anger. Similarly, character traits such as narcissism have been analysed with 72 linguistic features from LIWC 2001, using weighted Pearson's correlation technique [68]. The results showed a positive connection between narcissism, sexual references, swear word use, and a negative association with anxiety. LIWC features also helped to classify positive and negative sentiment from social media opinion posts [69]. High classification accuracy scores on the task were achieved with the following features: psychological processes, relativity, and personal concern.
Furthermore, prediction of the final course performance based on written self-introductions by students was described in [70]. Here, 84 of the LIWC features were gradually reduced to 20 based on the correlation with the final grade. Analysis was based on 321 written self-introductions and concluded that egocentrism and acting-in-the-present were linked with poor performance on the exam (prevalence of personal pronouns, use of verbs, and present tense words).
The current study includes features from the LIWC 2015 version, together with five other features that were defined in the research and are presented in Appendix B. The variables in Appendix B help us to understand the social, emotional, and linguistic composition of the COVID vaccine stigma sentiment with the most relevant features of the model discussed in Section 4.4.2.

Features of the model
The 30 most significant features in Tables 8 and 9 were derived from variables in Appendix B and are based on ANOVA F-test and RFE ranking. The latter identified the informative features, and the ANOVA F-test determined whether there was any statistically significant difference between mean values of features and annotation labels (classes) and how well a given feature discriminated between multiple classes.
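As an illustration of the ANOVA F-test mentioned above, the sketch below computes the one-way F statistic for a single feature whose values are grouped by annotation label. It is a minimal sketch under stated assumptions: the function name `anova_f` is hypothetical, the RFE ranking step is not shown, and the input must have some within-class variance, otherwise the division fails.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-class value lists.

    A larger F means the class means of the feature differ more,
    relative to the spread of values inside each class.
    """
    all_values = [value for group in groups for value in group]
    n_total, n_groups = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(group) / len(group) for group in groups]
    # Variation of class means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Variation of individual values around their own class mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, means)
        for value in group
    )
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))
```

In this setting each inner list would hold one feature's values (for example, words per sentence) for the stigma, not stigma, and undefined classes respectively.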
The z-score indicates how much the labelled classes can vary from the population mean. Certain features show polarised development of stigmatised comments versus not stigmatised comments for both the COVID vaccine stigma and disproving COVID vaccine conspiracy datasets.
Sentiment score (polarity on negative and positive sentiment), subjectivity, and engagement are additional features that are not part of LIWC variables.
Sentiment feature shows negative score for stigmatised comments and positive for not stigmatised. Subjectivity  is naturally higher for stigmatised content, and is also confirmed by the findings in Tables 8 and 9. Stigmatised sentiment is expressed in lengthier sentences/comments, which is presented through the high positive z-scores of words-per-sentence feature. Stigmatised sentiment is also seen in lengthier posts/comments (word count/no. characters feature). There is more stigmatised communication than not stigmatised. Function words that reflect the attitude or mood of a speaker are more frequent in stigmatised comments, which focus on the present time and exhibit the characteristics of negative emotions (such as anxiety and anger) and the use of swear words. Prevalence of present tense suggests greater psychological connection and continuation of the concern.
References to risk and danger are common, as are references to out-groups ("they/them" vs. "us"). Stigmatised sentiment is expressed with less emotion, which can suggest lesser involvement with the topic and features excessive use of auxiliary verbs ("may", "must", "should").
Conversely, perceptual processes (selecting, organising, and interpreting information) and work references are common in not stigma sentiment.
Not stigmatised comments/sentences are succinct, but they employ lengthier informal words, which suggests that more complex words are used. Moreover, not stigmatised sentiment is expressed in an emotional, authentic, and positive tone that is simultaneously analytical. Emotional tone can suggest greater immersion in the topic. In contrast to stigmatised sentiment, references to risk, danger, and health, negative emotions such as anxiety and anger, and swear words are rare in not stigmatised sentiment.
The engagement feature was log normalised to remove skewness from the highly variable data and is based on downvotes, upvotes, likes, dislikes, comments, and retweets. Engagement is important for the study as it can show different levels of participation in vaccine discussions. The RFE ranking deems the feature to be relevant for Reddit [71] stigma detection. However, z-score and ANOVA F-score did not detect any significant variances in engagement across stigma class labels.
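The text does not give the exact normalisation formula, so the following is only one plausible sketch. Because engagement can be negative (downvotes and dislikes), a plain logarithm is undefined for some values; the assumption here is a signed transform that applies log(1 + |x|) and restores the sign. The function name `signed_log1p` is hypothetical.

```python
import math


def signed_log1p(value):
    """Compress the magnitude with log(1 + |x|) while keeping the sign.

    Assumed transform: it reduces the skew of heavy-tailed engagement
    counts and stays defined for zero and negative engagement.
    """
    return math.copysign(math.log1p(abs(value)), value)
```

Under this transform a post with engagement 9 and a post with engagement −9 map to values of equal magnitude and opposite sign, so positive and negative engagement stay comparable.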
Unsupervised learning K-means clustering can serve as an additional visual interpretation of the features of the model. According to the distribution of the data in Fig. 3, stigmatised posts have higher word counts/are lengthier than not stigmatised posts, which is supported by the z-score findings in Tables 8 and 9. Stigmatised posts/comments receive mixed response (engagement), similar to not stigmatised posts; however, some show especially high engagement. From the observation of the study, the connection between engagement and stigma depends on the context. For example, in the in-group anti-vaccine discussions, stigmatised posts received more attention and consequently reported high positive engagement. Conversely, not-stigmatised posts are more emotional and authentic, using informal language, which can draw attention to the post in other contexts. However, further discussion on the topic of engagement is outside the scope of the current research and will be discussed in future work.

Visual analyses: co-occurrence network
To further visualise the comment responses to COVID vaccine stigma and disproving conspiracy posts, a co-occurrence network of words was applied with term frequency (69) and document frequency (1). To measure the strength of edges, the Jaccard coefficient was applied with the top 77-105 words presented. Darker lines and higher coefficients show stronger edges (coef. ≥0.1).
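The edge strength used in the network can be sketched as follows. For two words, the Jaccard coefficient is the number of comments containing both divided by the number of comments containing either. This is a minimal illustration: the function name `jaccard_edges` is hypothetical, the term-frequency and document-frequency filters mentioned above are omitted, and comparing all word pairs is only practical for a small vocabulary such as the top words shown in the figures.

```python
from itertools import combinations


def jaccard_edges(docs, min_coef=0.1):
    """Return {(word_a, word_b): Jaccard coefficient} for word pairs.

    Pairs whose coefficient is below `min_coef` are dropped, matching
    the threshold of 0.1 mentioned in the text.
    """
    containing = {}
    for index, doc in enumerate(docs):
        for word in set(doc.lower().split()):
            containing.setdefault(word, set()).add(index)
    edges = {}
    for word_a, word_b in combinations(sorted(containing), 2):
        both = len(containing[word_a] & containing[word_b])
        either = len(containing[word_a] | containing[word_b])
        coef = both / either
        if coef >= min_coef:
            edges[(word_a, word_b)] = coef
    return edges
```

Keys are alphabetically ordered word pairs, so each undirected edge appears once.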
The stigmatised Reddit posts in Fig. 4 show representative words such as "big", "business", "covid", and "produce", suggesting a fair share of the discussion is attributed to big business and its role in the pandemic.
"COVID" is characteristic of both stigmatised posts and comments. Stigmatised comments echo some of the sentiment from the posts with references to "government", "kill", "covid", and "vaccine". Central in the discussions is criticism of governments and warning against side effects of the vaccines. References to "kill", "die", and "death" under the topic of vaccines suggest fear and depressive moods of the people who wrote the comments.
The YouTube anti-COVID vaccine posts shown in Fig. 6 make references to "Pfizer", "Covid", and "Gates". All comments mention vaccine to a lesser or greater degree; stigmatised comments also make references to "Gates", "mark", "beast", and "chip".
The co-occurrence network of words, at times, provides us with an ambiguous, breviloquent idea about the main sentiment and topics discussed within a certain context. Correspondingly, sentiment gleaned from visual analyses provides us with a vague yet apropos conceptualisation of the stigma, not stigma, and undefined classes. The posts shown in Fig. 7 discuss vaccine conspiracy and alleged effects of the new vaccines, such as DNA-related risks, along with other concerns about side effects connected with the Moderna vaccine. Doctor Northrup, a known figure in the anti-vaccine movement, is frequently mentioned in the posts trying to disprove a COVID vaccine conspiracy. Stigmatised responses mention population control, forced practices, and appeal to freedom of choice in the arguments. Stigmatised comments also question the effectiveness of the vaccines and suggest that the vaccines did not go through proper development and testing procedures in such a short time frame.

Discussion and conclusion
This paper presented a computational model for identifying COVID vaccine stigma across social media platforms and addressed how to build such a model. To the best of the author's knowledge, this is the first time a computational model of vaccination discourse has been designed and the first research on COVID vaccines based on four social media platforms. Numerous annotators were involved in the process and several approaches were tested before each comment was annotated; consequently, labels were propagated to a larger dataset. The goal was to test how robust and reliable the model would be once classes were propagated from the vaccine discussions in the Facebook (Meta) dataset to the COVID vaccine discussions in the Twitter, YouTube, and Reddit dataset.
Without a rigorous, impartial annotation process and annotation scheme, the identification of such a nuanced concept as stigma would be unlikely, and the identification of sentiment would be completed with much less accuracy. All classification models achieved high levels of accuracy, but there is a statistically significant advantage in the performance of deep learning models. The deep learning models with pre-training significantly outperformed traditional classification models and successfully identified stigmatised sentiment.
Features of the stigma and not stigma classes are quite indicative of the annotation label assigned. In particular, stigma sentiment in COVID vaccine discussion has the following characteristic traits: i) it is expressed in lengthier sentences, ii) it shows negative sentiment, emotions of anxiety and anger, and those connected with risk, as well as the use of swear words, iii) it is less analytical, iv) it uses more auxiliary verbs such as "must", "should", and "can", and v) it employs a relatively reserved tone. Prejudiced sentiment leads to ignorance, hostility, and barriers to communication: "Erroneous ideas, Spinoza observed, lead to passion, for they are so confused that no one can use them as a basis for realistic adjustment. Correct and adequate ideas, by contrast, pave the way for a true assessment of life's problems" [72].
Fig. 7 Disproving COVID vaccine stigma: YouTube [71]
Therefore, neutral and not stigmatised sentiment is preferable, especially with polarised topics such as vaccines. This calls for the characteristics of i) shorter sentences, ii) more analytical features, iii) an authentic tone, iv) positive emotions void of anxiety, anger, and risk, with no use of swear words, and v) an informal tone, void of discrepancy, and differentiation. Stigmatised sentiment in COVID vaccine discourse does not lead to negative engagement with the content and the study did not find engagement to be a relevant feature in identifying stigma sentiment in COVID vaccine discourse. This could be explained by the mixed reaction to public posts/comments from antivaccine and pro-vaccine communities. Stigmatised antivaccine posts/comments might be considered engaging among like-minded in-group members, but might receive negative reactions from the pro-vaccine community and show neutral engagement on the balance.
This study found that anti-vaccine sentiment is often present in the comments as responses to disproving conspiracy posts. This finding is unexpected, given that previous work discovered antagonists (anti and pro-vaccine movement) concentrated primarily within their own public groups on Facebook (Meta), with homogeneous position on the topic of vaccines and abstinence from out-of-group activity [14,24]. Such contradictory evidence may be connected with the special circumstances of the COVID pandemic, where COVID anti-vaccination pages and posts were removed and some groups were banned across social media platforms. In response, the COVID anti-vaccine movement rebounded by moving to pro-vaccine channels, arguing for conspiracy theories and general stigma beliefs in response to statements attempting to disprove them. Some form of contact between COVID pro-vaccine and COVID anti-vaccine groups had thus been established.
Government attempts to de-platform the anti-vaccine movement did not succeed, but, instead, led to involuntary contact of the two groups. However, whether it was the right type of contact to reduce prejudice and prevent vaccination conspiracy theories, at least on a smaller scale, or if it provoked an even greater divide should be examined further. According to Gordon W. Allport, prejudice results from the lack of dialogue and lack of contact [5], and the antipodal stance can arguably be lessened when polarised groups are brought together [9,72-77]. In his 1954 work, Gordon W. Allport also stated that prejudice between an in-group and an out-group may be reduced under certain conditions [72]. The effects of the contact will be enhanced if it is encouraged by law or customs, and if the general conditions for the contact hypothesis to succeed are given: equal background, mutual goals, intergroup cooperation, and acknowledgement of authority that supports the interaction [72].
Elliot Aronson added further conditions: mutual interdependence, opportunity for frequent contact, and social norms that support such interactions [73]. Pettigrew et al. (2011) highlighted other positive outcomes of intergroup contact, such as greater trust and forgiveness of past transgressions [77]. Other researchers have indicated that effects generalise beyond immediate out-group members; are present across age ranges, genders and nations; and relate not only to ethnicity but also to healthcare and social issues [77]. Therefore, one can presume that the hypothesis generalises well for pro-vaccine and anti-vaccine groups. However, McClendon (1974) argued that one type of contact alone is not sufficient for optimal prejudice reduction and suggested a combination of Allport-Pettigrew theory and the theory of superordinate goal achievement [72,77-79].
Unfortunately, all those special conditions seem to be very difficult to achieve without serious supportive initiatives. Moreover, there are also a number of authors who have argued that reduction in prejudice is possible only on a smaller scale [80,81]. Amir (1969) argues that contact under unfavourable conditions has the opposite effect [80]. Consequently, it can be a matter of future work to establish the optimal conditions for prejudice reduction and ways to create a constructive dialogue between anti-COVID and pro-COVID vaccine communities. Nevertheless, constructive dialogue is important due to opposing views in the emotionally charged case of anti-vaccine campaigns that continue to pose a challenge to the efforts of public health authorities.
The issue is not likely to subside by removing anti-vaccine groups from social media platforms, as those messages nevertheless find their way back, according to the findings in the current research and analyses from [22,25]. Rifts between members of anti-vaccine and pro-vaccine movements, and between polarised groups in the broader context, lead to irresolution, mockery, distrust, friction, antagonism, and destabilising situations in society as the long-term result. The findings in this research can guide the choice of impartial, unbiased communication features in the future, which can possibly motivate concordant action and successful execution of commitment to reduce the dissonance, and establish constructive dialogue between polarised vaccine groups.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data Availability
The data supporting the reported results can be obtained from the author upon reasonable request.

Declarations
Ethics approval and consent to participate
Ethical review and approval for the study was disregarded because analyses were performed on social media text that did not contain any personal, sensitive information about or with reference to human subjects. Moreover, the data studied were shared publicly on social media domains by the users, who consented to unrestricted dissemination. Nevertheless, if information about user accounts appeared in the data, it was anonymised.

Conflict of Interests
The author declares no conflicts of interest.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.